Online Relative Entropy Policy Search using Reproducing Kernel Hilbert Space Embeddings
Abstract
Kernel methods have been successfully applied to reinforcement learning problems to address challenges such as high-dimensional and continuous states, value function approximation, and state transition probability modeling. In this paper, we develop an online policy search algorithm based on REPS-RKHS, a recent state-of-the-art algorithm that uses conditional kernel embeddings. Our online algorithm inherits the advantages of REPS-RKHS, including the ability to learn non-parametric control policies for infinite-horizon continuous MDPs with high-dimensional sensory representations. Unlike the original REPS-RKHS algorithm, which is based on batch learning, the proposed algorithm updates the model online and is thus able to capture and respond to rapid changes in the system dynamics. In addition, each online update takes constant time (i.e., independent of the sample size n), which is computationally much more efficient and allows the policy to be revised continuously. Experiments on different domains show that our online algorithm outperforms the original algorithm.
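For readers unfamiliar with conditional kernel embeddings, the following is a minimal batch sketch of the standard empirical estimator, in which a conditional expectation E[f(Y) | x] is approximated by a weighted sum of observed f(y_i) with weights beta(x) = (K + n*reg*I)^{-1} k(x). It is not the REPS-RKHS algorithm or the online variant described in the abstract: the Gaussian kernel, the toy sine data, and the names `gaussian_kernel`, `embedding_weights`, `reg`, and `bandwidth` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def embedding_weights(X, x_query, reg=1e-3, bandwidth=1.0):
    """Weights beta(x) = (K + n*reg*I)^{-1} k(x) of the empirical conditional embedding."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)          # (n, n) Gram matrix
    k_x = gaussian_kernel(X, x_query, bandwidth)  # (n, m) kernel evaluations at the queries
    return np.linalg.solve(K + n * reg * np.eye(n), k_x)


# Toy data (hypothetical): y = sin(x) + small noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
Y = np.sin(X) + 0.05 * rng.standard_normal(X.shape)

# Estimate E[Y | x = 0.5] as a weighted sum of the observed outputs.
x_query = np.array([[0.5]])
beta = embedding_weights(X, x_query, reg=1e-3, bandwidth=0.5)
estimate = float((Y.T @ beta)[0, 0])
print(estimate, np.sin(0.5))
```

Note that this batch estimator solves an n-by-n linear system, so its cost grows with the sample size n. That dependence is what the constant-time online update in the abstract is meant to avoid; how the paper achieves it is not shown here.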
Similar resources
Hilbert Space Embeddings of POMDPs
A nonparametric approach for policy learning for POMDPs is proposed. The approach represents distributions over the states, observations, and actions as embeddings in feature spaces, which are reproducing kernel Hilbert spaces. Distributions over states given the observations are obtained by applying the kernel Bayes’ rule to these distribution embeddings. Policies and value functions are defin...
Reproducing Kernel Space Hilbert Method for Solving Generalized Burgers Equation
In this paper, we present a new method based on Reproducing Kernel Space (RKS) theory, together with an iterative algorithm for solving the Generalized Burgers Equation (GBE). The analytical solution is expressed as a series in an RKS, and the approximate solution u(x,t) is constructed by truncating the series. The convergence of u(x,t) to the analytical solution is also proved.
Solving multi-order fractional differential equations by reproducing kernel Hilbert space method
In this paper we propose a relatively new semi-analytical technique to approximate the solution of nonlinear multi-order fractional differential equations (FDEs). We present some results concerning the uniqueness of the solution of nonlinear multi-order FDEs and discuss the existence of a solution for nonlinear multi-order FDEs in a reproducing kernel Hilbert space (RKHS). We further give an error a...
Solving Fuzzy Impulsive Fractional Differential Equations by Reproducing Kernel Hilbert Space Method
The aim of this paper is to use the Reproducing Kernel Hilbert Space Method (RKHSM) to solve linear and nonlinear fuzzy impulsive fractional differential equations. Finding numerical solutions of this class of equations is a difficult topic to analyze. In this study, convergence analysis, error estimates, and error bounds are discussed in detail under some hypotheses which provi...
Some Properties of Reproducing Kernel Banach and Hilbert Spaces
This paper is devoted to the study of reproducing kernel Hilbert spaces. We focus on multipliers of reproducing kernel Banach and Hilbert spaces. In particular, we try to extend this concept and prove some related theorems. Moreover, we focus on reproducing kernels in vector-valued reproducing kernel Hilbert spaces. In particular, we extend reproducing kernels to relative reproducing kernels an...